Loose lips sink ships: Mitigating Length Bias in Reinforcement Learning from Human Feedback

Published in Findings of EMNLP, 2023